{
 "cells": [
  {
   "cell_type": "raw",
   "id": "19cc5b11-3822-454b-afb3-7bebd7f17b5c",
   "metadata": {},
   "source": [
    "---\n",
    "sidebar_position: 1\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e17a273-bcfc-433f-8d42-2ba9533feeb8",
   "metadata": {},
   "source": [
    "# Semantic layer over graph database\n",
    "\n",
    "You can use database queries to retrieve information from a graph database like Neo4j.\n",
    "One option is to use LLMs to generate Cypher statements.\n",
    "While that option provides excellent flexibility, the solution could be brittle and not consistently generating precise Cypher statements.\n",
    "Instead of generating Cypher statements, we can implement Cypher templates as tools in a semantic layer that an LLM agent can interact with.\n",
    "\n",
    "![graph_semantic.png](../../../static/img/graph_semantic.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e811ebad",
   "metadata": {},
   "source": [
    "## Setup\n",
    "#### Install dependencies\n",
    "\n",
    "```{=mdx}\n",
    "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n",
    "import Npm2Yarn from \"@theme/Npm2Yarn\";\n",
    "\n",
    "<IntegrationInstallTooltip></IntegrationInstallTooltip>\n",
    "\n",
    "<Npm2Yarn>\n",
    "  langchain @langchain/community @langchain/openai neo4j-driver zod\n",
    "</Npm2Yarn>\n",
    "```\n",
    "\n",
    "#### Set environment variables\n",
    "\n",
    "We'll use OpenAI in this example:\n",
    "\n",
    "```env\n",
    "OPENAI_API_KEY=your-api-key\n",
    "\n",
    "# Optional, use LangSmith for best-in-class observability\n",
    "LANGSMITH_API_KEY=your-api-key\n",
    "LANGCHAIN_TRACING_V2=true\n",
    "```\n",
    "\n",
    "Next, we need to define Neo4j credentials.\n",
    "Follow [these installation steps](https://neo4j.com/docs/operations-manual/current/installation/) to set up a Neo4j database.\n",
    "\n",
    "```env\n",
    "NEO4J_URI=\"bolt://localhost:7687\"\n",
    "NEO4J_USERNAME=\"neo4j\"\n",
    "NEO4J_PASSWORD=\"password\"\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1e8fbc2c-b8e8-4c53-8fce-243cf99d3c1c",
   "metadata": {},
   "source": [
    "The below example will create a connection with a Neo4j database and will populate it with example data about movies and their actors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c84b1449-6fcd-4140-b591-cb45e8dce207",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Schema refreshed successfully.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import \"neo4j-driver\";\n",
    "import { Neo4jGraph } from \"@langchain/community/graphs/neo4j_graph\";\n",
    "\n",
    "const url = Deno.env.get(\"NEO4J_URI\");\n",
    "const username = Deno.env.get(\"NEO4J_USER\");\n",
    "const password = Deno.env.get(\"NEO4J_PASSWORD\");\n",
    "const graph = await Neo4jGraph.initialize({ url, username, password });\n",
    "\n",
    "// Import movie information\n",
    "const moviesQuery = `LOAD CSV WITH HEADERS FROM \n",
    "'https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/movies/movies_small.csv'\n",
    "AS row\n",
    "MERGE (m:Movie {id:row.movieId})\n",
    "SET m.released = date(row.released),\n",
    "    m.title = row.title,\n",
    "    m.imdbRating = toFloat(row.imdbRating)\n",
    "FOREACH (director in split(row.director, '|') | \n",
    "    MERGE (p:Person {name:trim(director)})\n",
    "    MERGE (p)-[:DIRECTED]->(m))\n",
    "FOREACH (actor in split(row.actors, '|') | \n",
    "    MERGE (p:Person {name:trim(actor)})\n",
    "    MERGE (p)-[:ACTED_IN]->(m))\n",
    "FOREACH (genre in split(row.genres, '|') | \n",
    "    MERGE (g:Genre {name:trim(genre)})\n",
    "    MERGE (m)-[:IN_GENRE]->(g))`\n",
    "\n",
    "await graph.query(moviesQuery);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "403b9acd-aa0d-4157-b9de-6ec426835c43",
   "metadata": {},
   "source": [
    "## Custom tools with Cypher templates\n",
    "\n",
    "A semantic layer consists of various tools exposed to an LLM that it can use to interact with a knowledge graph.\n",
    "They can be of various complexity. You can think of each tool in a semantic layer as a function.\n",
    "\n",
    "The function we will implement is to retrieve information about movies or their cast."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d1dc1c8c-f343-4024-924b-a8a86cf5f1af",
   "metadata": {},
   "outputs": [],
   "source": [
    "const descriptionQuery = `MATCH (m:Movie|Person)\n",
    "WHERE m.title CONTAINS $candidate OR m.name CONTAINS $candidate\n",
    "MATCH (m)-[r:ACTED_IN|HAS_GENRE]-(t)\n",
    "WITH m, type(r) as type, collect(coalesce(t.name, t.title)) as names\n",
    "WITH m, type+\": \"+reduce(s=\"\", n IN names | s + n + \", \") as types\n",
    "WITH m, collect(types) as contexts\n",
    "WITH m, \"type:\" + labels(m)[0] + \"\\ntitle: \"+ coalesce(m.title, m.name) \n",
    "       + \"\\nyear: \"+coalesce(m.released,\"\") +\"\\n\" +\n",
    "       reduce(s=\"\", c in contexts | s + substring(c, 0, size(c)-2) +\"\\n\") as context\n",
    "RETURN context LIMIT 1`\n",
    "\n",
    "const getInformation = async (entity: string) => {\n",
    "    try {\n",
    "        const data = await graph.query(descriptionQuery, { candidate: entity });\n",
    "        return data[0][\"context\"];\n",
    "    } catch (error) {\n",
    "        return \"No information was found\";\n",
    "    }\n",
    "    \n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bdecc24b-8065-4755-98cc-9c6d093d4897",
   "metadata": {},
   "source": [
    "You can observe that we have defined the Cypher statement used to retrieve information.\n",
    "Therefore, we can avoid generating Cypher statements and use the LLM agent to only populate the input parameters.\n",
    "To provide additional information to an LLM agent about when to use the tool and their input parameters, we wrap the function as a tool."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "f4cde772-0d05-475d-a2f0-b53e1669bd13",
   "metadata": {},
   "outputs": [],
   "source": [
    "import { StructuredTool } from \"@langchain/core/tools\";\n",
    "import { z } from \"zod\";\n",
    "\n",
    "const informationInput = z.object({\n",
    "    entity: z.string().describe(\"movie or a person mentioned in the question\"),\n",
    "});\n",
    "\n",
    "class InformationTool extends StructuredTool {\n",
    "    schema = informationInput;\n",
    "\n",
    "    name = \"Information\";\n",
    "\n",
    "    description = \"useful for when you need to answer questions about various actors or movies\";\n",
    "\n",
    "    async _call(input: z.infer<typeof informationInput>): Promise<string> {\n",
    "        return getInformation(input.entity);\n",
    "    }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff4820aa-2b57-4558-901f-6d984b326738",
   "metadata": {},
   "source": [
    "## OpenAI Agent\n",
    "\n",
    "LangChain expression language makes it very convenient to define an agent to interact with a graph database over the semantic layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "6e959ac2-537d-4358-a43b-e3a47f68e1d6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import { ChatOpenAI } from \"@langchain/openai\";\n",
    "import { AgentExecutor } from \"langchain/agents\";\n",
    "import { formatToOpenAIFunctionMessages } from \"langchain/agents/format_scratchpad\";\n",
    "import { OpenAIFunctionsAgentOutputParser } from \"langchain/agents/openai/output_parser\";\n",
    "import { convertToOpenAIFunction } from \"@langchain/core/utils/function_calling\";\n",
    "import { ChatPromptTemplate, MessagesPlaceholder } from \"@langchain/core/prompts\";\n",
    "import { AIMessage, BaseMessage, HumanMessage } from \"@langchain/core/messages\";\n",
    "import { RunnableSequence } from \"@langchain/core/runnables\";\n",
    "\n",
    "const llm = new ChatOpenAI({ modelName: \"gpt-3.5-turbo\", temperature: 0 })\n",
    "const tools = [new InformationTool()]\n",
    "\n",
    "const llmWithTools = llm.bind({\n",
    "    functions: tools.map(convertToOpenAIFunction),\n",
    "})\n",
    "\n",
    "const prompt = ChatPromptTemplate.fromMessages(\n",
    "    [\n",
    "        [\n",
    "            \"system\",\n",
    "            \"You are a helpful assistant that finds information about movies and recommends them. If tools require follow up questions, make sure to ask the user for clarification. Make sure to include any available options that need to be clarified in the follow up questions Do only the things the user specifically requested.\"\n",
    "        ],\n",
    "        new MessagesPlaceholder(\"chat_history\"),\n",
    "        [\"human\", \"{input}\"],\n",
    "        new MessagesPlaceholder(\"agent_scratchpad\"),\n",
    "    ]\n",
    ")\n",
    "\n",
    "const _formatChatHistory = (chatHistory) => {\n",
    "    const buffer: Array<BaseMessage> = []\n",
    "    for (const [human, ai] of chatHistory) {\n",
    "        buffer.push(new HumanMessage({ content: human }))\n",
    "        buffer.push(new AIMessage({ content: ai }))\n",
    "    }\n",
    "    return buffer\n",
    "}\n",
    "\n",
    "const agent = RunnableSequence.from([\n",
    "    {\n",
    "        input: (x) => x.input,\n",
    "        chat_history: (x) => {\n",
    "            if (\"chat_history\" in x) {\n",
    "                return _formatChatHistory(x.chat_history);\n",
    "            }\n",
    "            return [];\n",
    "        },\n",
    "        agent_scratchpad: (x) => {\n",
    "            if (\"steps\" in x) {\n",
    "                return formatToOpenAIFunctionMessages(\n",
    "                    x.steps\n",
    "                );\n",
    "            }\n",
    "            return [];\n",
    "        },\n",
    "    },\n",
    "    prompt,\n",
    "    llmWithTools,\n",
    "    new OpenAIFunctionsAgentOutputParser(),\n",
    "])\n",
    "\n",
    "const agentExecutor = new AgentExecutor({ agent, tools });"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "b0459833-fe84-4ebc-9823-a3a3ffd929e9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{\n",
       "  input: \u001b[32m\"Who played in Casino?\"\u001b[39m,\n",
       "  output: \u001b[32m'The movie \"Casino\" starred James Woods, Joe Pesci, Robert De Niro, and Sharon Stone.'\u001b[39m\n",
       "}"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "await agentExecutor.invoke({ input: \"Who played in Casino?\" })"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Deno",
   "language": "typescript",
   "name": "deno"
  },
  "language_info": {
   "file_extension": ".ts",
   "mimetype": "text/x.typescript",
   "name": "typescript",
   "nb_converter": "script",
   "pygments_lexer": "typescript",
   "version": "5.3.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
